The DEMOSTHeNES speech composer

Authors

  • Gerasimos Xydas
  • Georgios Kouroupetroglou
Abstract

In this paper we present the design and development of a modular and scalable speech composer named DEMOSTHeNES. It has been designed for converting plain or formatted text (e.g. HTML) to a combination of speech and audio signals. The architecture of DEMOSTHeNES extends the structure of current Text-to-Speech systems so that an open set of module-defined functions can interact with the text being processed at any stage of the text-to-speech conversion. Details of its implementation are given here. Furthermore, we present some techniques for text handling and prosody generation using DEMOSTHeNES.

Similar resources

Text-to-speech scripting interface for appropriate vocalisation of e-texts

Electronic texts carry important meta-information (such as tags in HTML) that most of the current Text-to-Speech (TtS) systems ignore during the production of the speech. We propose an approach to exploit this meta-information in order to achieve a detailed auditory representation of an e-text. The e-Text to Speech and Audio (e-TSA) Composer has been designed and developed as an XML based scrip...

Prosody Modelling for Syllable-based Speech Synthesis

The prosody model used in the syllable-based speech synthesizer DEMOSTHENES is described in this paper. The paper focuses on the segmental structure, especially on the segmentation into rhythm units (prosodic phrases). Relations between prosodic segments and sentence constituents are also discussed.

Real Time Speech Recognition Using DSK TMS320C6713

Speech recognition is an important field of digital signal processing. The objective of Automatic Speaker Recognition (ASR) is to extract features and to characterize and recognize the speaker. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used feature vector for ASR. MFCC are used for designing a text-dependent speaker identification system. In this paper the DSP processor TMS320C6713 with Code Comp...

What concept-to-speech can gain for prosody

This article proposes a concept-to-speech system with automated prosody learning based on reinforcement learning. The concept-to-speech system, named Demosthenes, is an extension of the text-to-speech system DreSS. Demosthenes is responsible for template-based text generation and symbolic prosody prediction, while DreSS takes care of acoustic prosody and speech synthesis. The prosody predictor ...

Publication date: 2001